Processing and Recognition of Handwritten Documents
Abstract
The accurate recognition of machine-printed characters is now considered a largely solved problem, and many commercial products achieve high recognition rates on such text. Handwritten character recognition is comparatively difficult, so the recognition of handwritten documents remains a subject of active research. In this thesis we studied the processing stages of handwritten optical character recognition, with a focus on the recognition stage. At the recognition stage, a feature vector is computed for each segmented character and used to assign the character to one of a set of predefined classes with machine learning techniques. We studied several feature extraction techniques and developed methodologies that efficiently combine different types of features. Furthermore, we propose a novel methodology that extracts features and classifies characters using a hierarchical scheme. This methodology was tested on well-known character databases, as well as on databases of characters from historical documents and a database of contemporary Greek handwritten characters, which were created specifically for this thesis; it achieved recognition rates that are among the best reported in the literature. The methodology was also applied to cursive handwritten words, where the recognition rates were likewise very high. Finally, we suggest an algorithm that automatically estimates the free parameters involved in character segmentation. Character segmentation is very important because its result directly affects the recognition rates, so an optimal segmentation is essential for successful recognition.
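The recognition stage described in the abstract (one feature vector per segmented character, then assignment to a predefined class) can be illustrated with a minimal sketch. The abstract does not name the thesis's features or classifier, so the zoning-density features and 1-nearest-neighbour rule below are stand-in assumptions chosen for brevity. The function names, the 4x4 toy glyphs, and the class labels are all hypothetical.

```python
from math import dist


def zoning_features(image, zones=2):
    """Split a binary image (rows of 0/1) into zones x zones cells and
    return the fraction of foreground pixels in each cell, row by row."""
    height, width = len(image), len(image[0])
    features = []
    for zr in range(zones):
        r0, r1 = zr * height // zones, (zr + 1) * height // zones
        for zc in range(zones):
            c0, c1 = zc * width // zones, (zc + 1) * width // zones
            cell = [image[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            features.append(sum(cell) / len(cell))
    return features


def classify(features, training_set):
    """Return the label of the closest training vector (1-nearest-neighbour).
    training_set is a list of (label, feature_vector) pairs."""
    label, _ = min(training_set, key=lambda item: dist(features, item[1]))
    return label


# Hypothetical toy glyphs standing in for segmented characters.
LEFT_BAR = [[1, 0, 0, 0],
            [1, 0, 0, 0],
            [1, 0, 0, 0],
            [1, 0, 0, 0]]
TOP_BAR = [[1, 1, 1, 1],
           [0, 0, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0]]
# A left bar with one stray foreground pixel, used as the unseen sample.
NOISY_LEFT_BAR = [[1, 0, 0, 0],
                  [1, 0, 0, 0],
                  [1, 0, 0, 1],
                  [1, 0, 0, 0]]

training_set = [
    ("left_bar", zoning_features(LEFT_BAR)),
    ("top_bar", zoning_features(TOP_BAR)),
]

predicted = classify(zoning_features(NOISY_LEFT_BAR), training_set)
print(predicted)
```

Features that summarise local pixel density tolerate small distortions, which is why the noisy sample still falls closest to the left-bar class in this toy setup. Real systems use far richer features and classifiers than this sketch.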
Similar resources
Neural Network Based Recognition System Integrating Feature Extraction and Classification for English Handwritten
Handwriting recognition has been one of the active and challenging research areas in the field of image processing and pattern recognition. It has numerous applications that includes, reading aid for blind, bank cheques and conversion of any hand written document into structural text form. Neural Network (NN) with its inherent learning ability offers promising solutions for handwritten characte...
Use of the Shearlet Transform and Transfer Learning in Offline Handwritten Signature Verification and Recognition
Despite the growing growth of technology, handwritten signature has been selected as the first option between biometrics by users. In this paper, a new methodology for offline handwritten signature verification and recognition based on the Shearlet transform and transfer learning is proposed. Since, a large percentage of handwritten signatures are composed of curves and the performance of a sig...
Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model
In order to facilitate the entry of data into the computer and its digitalization, automatic recognition of printed texts and manuscripts is one of the considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...
Exploiting Collection Level for Improving Assisted Handwritten Words Transcription of Historical Documents
Transcription of handwritten words in historical documents is still a difficult task. When processing huge amount of pages, document centered approaches are limited by the trade-off between automatic recognition errors and the tedious aspect of human user annotation work. In this article, we investigate the use of inter page dependencies to overcome those limitations. For this, we propose a new...
Isolated Persian/Arabic handwriting characters: Derivative projection profile features, implemented on GPUs
For many years, researchers have studied high accuracy methods for recognizing the handwriting and achieved many significant improvements. However, an issue that has rarely been studied is the speed of these methods. Considering the computer hardware limitations, it is necessary for these methods to run in high speed. One of the methods to increase the processing speed is to use the computer pa...
CITlab ARGUS for historical handwritten documents
We describe CITlab’s recognition system for the HTRtS competition attached to the 14. International Conference on Frontiers in Handwriting Recognition, ICFHR 2014. The task comprises the recognition of historical handwritten documents. The core algorithms of our system are based on multidimensional recurrent neural networks (MDRNN) and connectionist temporal classification (CTC). The software m...
Publication date: 2011